We show how to optimally discriminate between K distinct
quantum states, of which N copies are available, using
one-at-a-time interactions with each of the N copies. While
this task (famously) requires joint measurements on all N
copies, we show that it can be solved with one-at-a-time
"coherent measurements" performed by an apparatus with
$\log_2 K$ qubits of quantum memory. We apply the same
technique to optimal discrimination between K distinct
N-particle matrix product states of bond dimension D, using a
coherent measurement apparatus with $\log_2 K + \log_2 D$
qubits of memory.

Quantum state discrimination is the following problem: Given
N quantum systems that were all prepared in one of K distinct
states $|\psi_1\rangle, \ldots, |\psi_K\rangle$, decide in
which state they were prepared. Finding the optimal
measurement is a straightforward convex program, in principle.
But when N > 1 copies of $|\psi_k\rangle$ are available, the
optimal measurement is usually a joint measurement on all N
copies. Such measurements can be prohibitively difficult.
Observing each of the N copies independently yields a strictly
lower probability of success. This contrasts starkly with the
corresponding classical problem of distinguishing K distinct
probability distributions, where one-at-a-time observations
are completely sufficient.

In this paper, we demonstrate that by slightly relaxing the
usual meaning of "observation", it is possible to do optimal
discrimination using one-at-a-time observations, which
restores a pleasing symmetry with the classical case. A
quantum measurement conventionally comprises: (i) a controlled
unitary interaction between a system S and an apparatus A;
(ii) decoherence on A, which forces its state into a mixture
of "pointer basis" states [5]; and (iii) experimental readout
of the classical result from A (arguably accompanied by
"collapse" of A's state). We relax this prescription by making
A a quantum information processor (QIP) - basically a very
small (perhaps just 1 qubit) non-scalable quantum computer. We
protect A from decoherence and avoid reading out any
information until the very end of the protocol. What remains
is a coherent measurement, a unitary interaction between S and
A that transfers information from S to A.

We begin by showing how to realize optimal discrimination
between N copies of K pure states with one-at-a-time coherent
measurements, using a $\log_2 K$-qubit QIP. Next, we apply the
same technique to optimal discrimination of many-body matrix
product states (MPS). Our protocol distinguishes between K
distinct N-particle MPS with bond dimension D, and uses a
$(\log_2 K + \log_2 D)$-qubit QIP. Finally, we combine our
first two results to get a protocol for discriminating between
M copies of K distinct MPS using a
$(\log_2 K + \log_2 D)$-qubit QIP.

I. DISCRIMINATING N-COPY STATES

Suppose we are given N quantum systems ($S_1 \ldots S_N$) with
d-dimensional Hilbert spaces $\mathcal{H}_n$, and a promise
that they were all identically prepared in one of K
nonorthogonal states $\{|\psi_1\rangle \ldots |\psi_K\rangle\}$.
Their joint state is
$|\psi_k\rangle^{\otimes N} \in \mathcal{H}^{\otimes N}$, with k
unknown. Identifying k with maximum success probability
requires a joint measurement on all N samples. Non-adaptive
one-at-a-time measurement cannot achieve the optimal success
probability. For K = 2 candidate states, there is an adaptive
local measurement scheme that achieves the optimal success
probability [6], but no such protocol has been found for K > 2
(despite some effort [7] - which suggests, but certainly does
not prove, that no such protocol exists).
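The gap between joint and non-adaptive local strategies can be made concrete for K = 2. The sketch below assumes equal priors and compares the Helstrom success probability for N copies with one particular non-adaptive scheme: the single-copy Helstrom measurement on every copy, followed by a majority vote. The function names `helstrom_joint` and `local_majority` are illustrative, and the local scheme is only an example, not claimed to be the best non-adaptive one.

```python
import math

def helstrom_joint(c, n):
    """Optimal (Helstrom) success probability for two equiprobable pure
    states with single-copy overlap c = |<psi_1|psi_2>|, given n copies."""
    return 0.5 * (1.0 + math.sqrt(1.0 - c ** (2 * n)))

def local_majority(c, n):
    """Success probability of one fixed non-adaptive scheme: the single-copy
    Helstrom measurement on each copy, then a majority vote (n odd)."""
    p = helstrom_joint(c, 1)  # per-copy success probability
    return sum(math.comb(n, m) * p**m * (1 - p) ** (n - m)
               for m in range(n // 2 + 1, n + 1))

c = 0.8  # assumed overlap, chosen only for illustration
for n in (1, 3, 5, 9):
    print(n, round(helstrom_joint(c, n), 4), round(local_majority(c, n), 4))
```

For every N > 1 in this example the joint value exceeds the local one; the two coincide only at N = 1.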

All the information about k is contained in a K-dimensional
subspace

$\mathcal{K}_N = \mathrm{Span}(\{|\psi_k\rangle^{\otimes N}\})$. (1)

So while the optimal measurement is a joint measurement, it
does not need to explore the majority of
$\mathcal{H}^{\otimes N}$. We will implement it by rotating the
entire subspace $\mathcal{K}_N$ into the state space of our
K-dimensional QIP A (the coherent measurement apparatus). We
do so via sequential independent interactions between A and
each of the N samples $S_n$, "rolling up" all information about
k into A.

A is initially prepared in the $|0\rangle$ state. We bring it
into contact with $S_1$, and execute a SWAP gate between $S_1$
and the $\{|0\rangle, \ldots, |d-1\rangle\}$ subspace of A. This
transfers all information from the first sample into A,
leaving $S_1$ in the $|0\rangle$ state.

Now we bring A into contact with $S_2$. Their joint state is
$|\psi_k\rangle^{\otimes 2}$, although we do not know k. But we
do know that the state lies within
$\mathcal{K}_2 = \mathrm{Span}(\{|\psi_k\rangle^{\otimes 2}\})$
(see Eq. 1), whose dimension is at most K. A basis
$\{|\phi_j\rangle : j = 1 \ldots K\}$ for this space can be
obtained by Gram-Schmidt orthogonalization. We apply a unitary
interaction between A and $S_2$,

$U_2 = \sum_j |0\rangle_{S_2} |j\rangle_A \langle\phi_j|$. (2)

It rotates $\mathcal{K}_2$ into
$\{|0\rangle\}_{S_2} \otimes \mathcal{H}_A$, which places all the
information about k in A and decouples $S_2$. ($S_2$ is left
with no information about k if and only if A is left with all
the information about k.) A is now in one of K possible states
$|\psi_k^{(2)}\rangle$, which (as a set) are unitarily
equivalent to $\{|\psi_k\rangle^{\otimes 2}\}$, e.g.,
$\langle\psi_j^{(2)}|\psi_k^{(2)}\rangle = \langle\psi_j|\psi_k\rangle^2$.
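The second step can be checked numerically. The following is a minimal sketch, assuming d = 2, K = 3, randomly drawn candidate states, and an apparatus of dimension K; QR factorization stands in for Gram-Schmidt, and only the action of $U_2$ on $\mathcal{K}_2$ is computed (its extension to a full unitary on the complement is arbitrary and omitted). All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K = 2, 3                      # sample dimension, number of candidates

def random_state(dim):
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

psis = [random_state(d) for _ in range(K)]

def embed(psi):
    """State of A after the SWAP: |psi> stored in the first d levels of A."""
    a = np.zeros(K, dtype=complex)
    a[:d] = psi
    return a

# Joint states of A (x) S_2 for each candidate k, stored as columns.
joint = np.array([np.kron(embed(p), p) for p in psis]).T   # shape (K*d, K)

# Orthonormal basis {|phi_j>} of K_2; QR plays the role of Gram-Schmidt.
phi, _ = np.linalg.qr(joint)                               # shape (K*d, K)

# U_2 restricted to K_2: A's new amplitudes are <phi_j|joint_k>,
# and S_2 is left in |0>. Column k is |psi_k^(2)>.
a2 = phi.conj().T @ joint

gram1 = np.array([[np.vdot(p, q) for q in psis] for p in psis])
gram2 = a2.conj().T @ a2
print(np.allclose(gram2, gram1**2))
```

The comparison of Gram matrices is the numerical counterpart of the statement that the K apparatus states are unitarily equivalent to the two-copy states.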

The rest of the algorithm is now fairly obvious; we move on
and interact A with $S_3$ in the same way, and so on. At each
step, when A comes into contact with $S_n$, their joint state
is $|\psi_k^{(n-1)}\rangle \otimes |\psi_k\rangle$. These K
alternatives span a K-dimensional space $\mathcal{K}_n$ (see
Eq. 1), spanned by a basis $\{|\phi_j^{(n)}\rangle\}$, which is
then rotated into $\{|0\rangle\}_{S_n} \otimes \mathcal{H}_A$ by
applying

$U_n = \sum_j |0\rangle_{S_n} |j\rangle_A \langle\phi_j^{(n)}|$.

Each sample system is left in the $|0\rangle$ state, indicating
that all its information has been extracted. After every
sample has been sucked dry, we simply measure A to extract k.
This final measurement can be efficiently computed via convex
programming, since A is only K-dimensional.
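The whole sequence can be simulated classically for small cases. This is a sketch under stated assumptions, not the authors' implementation: the apparatus register is given dimension max(K, d), each interaction is represented only by its action on the relevant span (via QR), and the final check uses the K = 2 Helstrom formula with equal priors. The function name `roll_up` is hypothetical.

```python
import numpy as np

def roll_up(psis, n_copies):
    """Return the K possible final states of A (as columns) after A has
    interacted once with each of n_copies identically prepared samples."""
    K, d = len(psis), len(psis[0])
    dim_a = max(K, d)                       # assumed size of A's register
    # Step 1: SWAP the first sample into the first d levels of A.
    a = np.zeros((dim_a, K), dtype=complex)
    for k, psi in enumerate(psis):
        a[:d, k] = psi
    # Steps 2..N: rotate span{|a_k>|psi_k>} into |0>_S (x) H_A.
    for _ in range(n_copies - 1):
        joint = np.stack([np.kron(a[:, k], psis[k]) for k in range(K)], axis=1)
        phi, _ = np.linalg.qr(joint)        # orthonormal basis of the span
        a = np.zeros((dim_a, K), dtype=complex)
        a[:K, :] = phi.conj().T @ joint     # coefficients in that basis
    return a

theta, N = 0.3, 5                           # illustrative values
psis = [np.array([1, 0], dtype=complex),
        np.array([np.cos(theta), np.sin(theta)], dtype=complex)]
a = roll_up(psis, N)

overlap = abs(np.vdot(a[:, 0], a[:, 1]))
p_from_A = 0.5 * (1 + np.sqrt(1 - overlap**2))                 # measure A only
p_joint = 0.5 * (1 + np.sqrt(1 - np.cos(theta) ** (2 * N)))    # N-copy bound
print(np.isclose(p_from_A, p_joint))
```

In this example the overlap of the two final apparatus states equals the N-copy overlap, so a measurement on A alone reaches the joint-measurement success probability.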

The sequence of coherent measurement interactions is
independent of what sort of discrimination we want to do -
e.g., minimum-error, unambiguous discrimination,
maximum-confidence, etc. - because $\mathcal{K}_N$ is a
sufficient statistic for any inference about k, and our
protocol simply extracts it whole, leaving the decision rule
up to the final measurement on A. As in the classical case,
data gathering can now be separated from data analysis.

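This protocol can be modified to discriminate non-symmetric
product states - e.g.,
$|\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots |\psi_N\rangle$
vs.
$|\phi_1\rangle \otimes |\phi_2\rangle \otimes \cdots |\phi_N\rangle$.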



II. MATRIX PRODUCT STATE 
DISCRIMINATION 

The information about k can be "rolled up" using sequential
interactions because it is contained in a subspace
$\mathcal{K}_N$ with Schmidt rank [26] at most K across any
division of the N systems. Low Schmidt rank is critical.
Consider distinguishing between two states that are each
maximally entangled between the first N/2 samples and the last
N/2 samples. They lie in a 2-dimensional subspace, but it is
not accessible through our protocol. The first N/2 samples are
maximally entangled with the rest, so their reduced state has
rank $d^{N/2}$. At least $(N \log_2 d)/2$ qubits would be
needed to store the information extracted from the first N/2
samples.

But whenever the Schmidt rank condition is satisfied, 
a variation of our algorithm will work. For product states 
(above), each state has Schmidt rank 1, and the span of 
K such states has Schmidt rank at most K. 
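The Schmidt-rank obstruction is easy to see numerically. The sketch below assumes N = 4 qubits (d = 2) and compares a product state with two different states that are maximally entangled across the middle cut; `schmidt_rank` is an illustrative helper, not part of the protocol.

```python
import numpy as np

def schmidt_rank(state, dim_left, tol=1e-10):
    """Schmidt rank of a pure state across the cut (left | right)."""
    s = np.linalg.svd(state.reshape(dim_left, -1), compute_uv=False)
    return int(np.sum(s > tol))

d, half = 2, 2                      # N = 2 * half qubits
dim_half = d ** half

# Product state |0>^{(x)N}: Schmidt rank 1 across the middle cut.
product = np.zeros(dim_half**2)
product[0] = 1.0

# Maximally entangled between the first and the last N/2 sites.
max_ent = np.eye(dim_half).reshape(-1) / np.sqrt(dim_half)

# A second maximally entangled state (a permutation applied to one half).
perm = np.roll(np.eye(dim_half), 1, axis=0)
max_ent2 = perm.reshape(-1) / np.sqrt(dim_half)

ranks = [schmidt_rank(v, dim_half) for v in (product, max_ent, max_ent2)]
print(ranks)
```

The two entangled states span only a 2-dimensional subspace, yet each has rank $d^{N/2}$ across the cut, which is what the apparatus would have to hold after absorbing the first half.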

This property of low Schmidt rank is generalized by matrix
product states (MPS). An N-particle MPS with bond dimension D
is guaranteed to have Schmidt rank at most D across any
division of the 1D lattice.
[Figure 1 panels: an unknown MPS, one of
$\{|\psi_1\rangle, |\psi_2\rangle, \ldots, |\psi_K\rangle\}$,
modeled as one MPS with a reference R; first "coherent
measurement": SWAP $\rho_1 \to \rho_A$; second coherent
measurement: map $\rho_{A2} \to \rho_A$; final state:
correlation between {1..N} and R all transferred into A.]



FIG. 1: Our protocol represents an unknown MPS
$|\psi_k\rangle$ from a set
$\{|\psi_1\rangle \ldots |\psi_K\rangle\}$ by its purification -
a single MPS $|\Psi\rangle$ involving a fictitious reference R.
The algorithm then successively decorrelates each sample $S_n$
from the rest, storing $S_n$'s correlations with the remainder
of the lattice in the "apparatus" A. Ultimately, all
information about k (i.e., correlation with R) is contained in
A, which can be measured.



Thus, the span of K such MPS, each with bond dimension
$\le D$, has Schmidt rank at most DK. We denote such a
subspace,

$\mathcal{K} = \mathrm{Span}(\{|\psi_k\rangle\})$,

a matrix product subspace with bond dimension DK. Such a set
of MPS can be distinguished optimally with coherent
measurements and $(\log_2 D + \log_2 K)$ qubits.
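The Schmidt-rank bound DK can be checked on small random instances. The sketch below assumes open-boundary MPS with real Gaussian tensors, stored as dense vectors (feasible only for small N); `random_mps` and `schmidt_ranks` are illustrative helpers.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_mps(n, d, D):
    """Random open-boundary MPS as a normalized dense vector of length d**n."""
    dims = [1] + [D] * (n - 1) + [1]
    psi = np.ones((1, 1))
    for i in range(n):
        site = rng.normal(size=(dims[i], d, dims[i + 1]))
        psi = np.tensordot(psi, site, axes=([-1], [0]))
    psi = psi.reshape(-1)
    return psi / np.linalg.norm(psi)

def schmidt_ranks(vec, n, d, tol=1e-9):
    """Schmidt rank across every cut (sites 1..m | sites m+1..n)."""
    ranks = []
    for m in range(1, n):
        s = np.linalg.svd(vec.reshape(d**m, -1), compute_uv=False)
        ranks.append(int(np.sum(s > tol * s[0])))
    return ranks

n, d, D, K = 8, 2, 2, 3
states = [random_mps(n, d, D) for _ in range(K)]
coeffs = rng.normal(size=K)
superposition = sum(c * s for c, s in zip(coeffs, states))

single = schmidt_ranks(states[0], n, d)      # bounded by D
span = schmidt_ranks(superposition, n, d)    # bounded by D * K
print(single, span)
```

A generic vector on 8 qubits would have rank 16 across the middle cut, so the bound DK = 6 is a real restriction here.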

Our algorithm is a straightforward generalization of the one
for product states, and proceeds as shown in Figure 1. First,
we represent the MP subspace $\mathcal{K}$ by its purification
- a single MPS for $S_1 \ldots S_N$ and a fictitious reference
system R (written here up to normalization),

$|\Psi\rangle = \sum_k |\psi_k\rangle_{S_1 \cdots S_N} |k\rangle_R$. (3)

Information about k, which is contained in $\mathcal{K}$,
equates to correlation with the imaginary R.

Now, we initialize A in the $|0\rangle$ state, then SWAP its
state with that of $S_1$ (the first lattice site). This
decouples $S_1$, leaving
$A \otimes S_2 \cdots S_N \otimes R$ in a matrix product state,

$|0\rangle_A |\Psi\rangle_{S_1 \cdots S_N, R} \to |0\rangle_{S_1} |\Psi\rangle_{A, S_2 \cdots S_N, R}$, (4)

with Schmidt rank no greater than DK.

Now, to roll up each successive site $S_n$ (n = 2, ..., N), we
find the Schmidt decomposition of the current state between
$A \otimes S_n$ and the remainder of the lattice, write it
(generically) as

$|\Psi\rangle = \sum_{j=1}^{DK} \lambda_j |\mu_j\rangle_{A S_n} |\nu_j\rangle_{\mathrm{rest}}$, (5)

and apply a unitary operation [27] to $A \otimes S_n$,

$U_n = \sum_j |0\rangle_{S_n} |j\rangle_A \langle\mu_j|$, (6)

which decouples $S_n$ and leaves all the information
previously in $S_n \otimes A$ in A. Doing this successively at
each site decorrelates all the $S_n$, and we are left in the
state

$|0\rangle^{\otimes N}_{S_1 \cdots S_N} \otimes \sum_k |\psi_k^{(N)}\rangle_A |k\rangle_R$,

with all information about k in A, where it can be extracted
by a simple measurement.
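The roll-up of an MPS purification can be sketched with dense linear algebra for small lattices. The code below is an illustration under explicit assumptions: random real MPS with D = 2 on N = 6 qubits, K = 3 candidates, an apparatus of dimension DK (taken to be at least d), and an equal-weight purification. Each $U_n$ is represented implicitly: after the SVD, A's new state is the Schmidt coefficients times the right Schmidt vectors. The names `random_mps` and `roll_up_mps` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_mps(n, d, D):
    """Random open-boundary MPS as a normalized dense vector."""
    dims = [1] + [D] * (n - 1) + [1]
    psi = np.ones((1, 1))
    for i in range(n):
        site = rng.normal(size=(dims[i], d, dims[i + 1]))
        psi = np.tensordot(psi, site, axes=([-1], [0]))
    psi = psi.reshape(-1)
    return psi / np.linalg.norm(psi)

def roll_up_mps(states, d, dim_a):
    """Sequentially absorb sites S_1..S_N into A. Returns the K conditional
    states of A (columns) and the total weight lost to truncation."""
    K = len(states)
    M = np.stack(states, axis=1) / np.sqrt(K)   # purification, rows = sites
    T = np.zeros((dim_a, M.size // d))
    T[:d] = M.reshape(d, -1)                    # SWAP S_1 into A
    lost = 0.0
    while T.shape[1] > K:                       # sites remain to absorb
        # Rows: (A, S_n); columns: (remaining sites, R).
        _, s, vh = np.linalg.svd(T.reshape(dim_a * d, -1), full_matrices=False)
        r = min(len(s), dim_a)
        lost += float(np.sum(s[r:] ** 2))       # zero if Schmidt rank <= dim_a
        T = np.zeros((dim_a, vh.shape[1]))
        T[:r] = s[:r, None] * vh[:r]            # |mu_j> -> |0>_S |j>_A
    return T * np.sqrt(K), lost

n, d, D, K = 6, 2, 2, 3
states = [random_mps(n, d, D) for _ in range(K)]
final, lost = roll_up_mps(states, d, dim_a=D * K)

gram_lattice = np.array([[s @ t for t in states] for s in states])
gram_A = final.T @ final
print(np.allclose(gram_A, gram_lattice), lost)
```

Agreement of the two Gram matrices means the K apparatus states carry exactly the same distinguishability as the original N-particle states, using only DK levels.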

Recent proposals for local tomography are also based on
sequential interactions. Our protocol, with coherent
measurements, offers a tremendous efficiency advantage (at the
cost of requiring a small QIP!). It can distinguish
near-orthogonal MPS states with a single copy, whereas local
tomography requires O(N) copies. Distinguishing non-orthogonal
states requires multiple (M) copies. To apply our algorithm,
we simply line up the copies (they do not have to exist
simultaneously) and treat them as a single NM-particle MPS of
bond dimension D.

III. MIXED STATE DISCRIMINATION AND 
TOMOGRAPHY 

In the context of N-copy states (Section I), one may ask:

1. Can coherent measurement be used to discriminate mixed
states, i.e. $\rho_k^{\otimes N}$?

2. Can coherent measurement be used for full state tomography
(rather than discrimination)?

The answer to both is "Yes, but it seems to require an
O(log N)-qubit apparatus." This is a very favorable scaling,
but less remarkable (and less immediately useful) than the
O(1) scaling for pure state discrimination.

This is possible because the order of the samples is
completely irrelevant. As we scan through the samples, we can
discard ordering information, keeping only a sufficient
statistic for inference about $\rho$. The quantum Schur
transform does this. It is based on Schur-Weyl duality [18],
which states that because permutations of the samples commute
with collective rotations of all N samples, the N-copy Hilbert
space $\mathcal{H}_d^{\otimes N}$ can be refactored as

$\mathcal{H}_d^{\otimes N} = \bigoplus_\lambda \mathcal{U}_\lambda \otimes \mathcal{V}_\lambda$.

The $\mathcal{U}_\lambda$ factors are irreducible
representation spaces (irreps) of SU(d), the
$\mathcal{V}_\lambda$ factors are irreps of $S_N$, and
$\lambda$ labels the various irreps. The Schur transform can be
implemented by a unitary circuit that acts sequentially on the
samples, mapping N qudit registers into three quantum
registers containing (respectively) $\lambda$, $\mathcal{U}$,
and $\mathcal{V}$:

$\mathcal{H}_d^{\otimes N} \to \mathcal{H}_\lambda \otimes \mathcal{H}_\mathcal{U} \otimes \mathcal{H}_\mathcal{V}$.

The "ordering" register $\mathcal{H}_\mathcal{V}$ accounts for
nearly all the Hilbert space dimension of
$\mathcal{H}_d^{\otimes N}$, and since it is irrelevant to
inference it can be discarded as rapidly as it is produced.
What remains to be stored in A is:

1. a "label" register $\lambda$ (a sufficient statistic for
the spectrum of $\rho$),

2. an SU(d) register $\mathcal{U}$ (a sufficient statistic for
the eigenbasis of $\rho$).

The $\lambda$ register requires a Hilbert space with dimension
at least the number of Young diagrams with N boxes in at most
d rows, which for $N \gg d$ is approximately

$N^{d-1} / [d!\,(d-1)!]$.

The $\mathcal{U}$ register must hold the largest N-copy irrep
of SU(d), whose size can be calculated using hook-length
formulae, and upper bounded by

$\dim_{\max}(\mathcal{U}_\lambda) \le (N + d - 1)^{\frac{1}{2} d(d-1)}$.

Together, these registers require $O(d^2 \log N)$ qubits of
memory (although for pure state tomography, $O(d \log N)$
qubits of memory are sufficient).
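The register sizes can be tabulated directly for small cases. The sketch below counts Young diagrams with a standard partition-counting recurrence and evaluates SU(d) irrep dimensions with the Weyl dimension formula (used here in place of hook lengths; the two give the same dimensions). The helper names are illustrative, and the bound checked is $(N + d - 1)^{d(d-1)/2}$.

```python
import math

def num_diagrams(n, d):
    """Number of Young diagrams with n boxes in at most d rows."""
    # ways[m] counts partitions of m into parts of size <= d, which equals
    # the number of partitions of m into at most d parts (conjugation).
    ways = [1] + [0] * n
    for part in range(1, d + 1):
        for m in range(part, n + 1):
            ways[m] += ways[m - part]
    return ways[n]

def partitions(n, d, largest=None):
    """Yield partitions of n into at most d parts, zero-padded to length d."""
    if largest is None:
        largest = n
    if d == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n, largest), -1, -1):
        for rest in partitions(n - first, d - 1, first):
            yield (first,) + rest

def irrep_dim(lam):
    """Dimension of the SU(d) irrep labelled by lam (Weyl dimension formula)."""
    d = len(lam)
    pairs = [(i, j) for i in range(d) for j in range(i + 1, d)]
    num = math.prod(lam[i] - lam[j] + j - i for i, j in pairs)
    den = math.prod(j - i for i, j in pairs)
    return num // den

d = 3
for n in (8, 16, 32):
    labels = num_diagrams(n, d)
    biggest = max(irrep_dim(lam) for lam in partitions(n, d))
    qubits = math.log2(labels) + math.log2(biggest)
    print(n, labels, biggest, round(qubits, 2))
```

Doubling N adds a roughly constant number of qubits in this table, which is the logarithmic scaling described above.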

O(log N) memory appears to be necessary for optimal accuracy.
Consider the simplest possible case - discrimination of two
classical 1-bit probability distributions.

The sufficient statistic is the pair of frequencies of "0" and
"1", {n, N - n}. For any given problem, there is a threshold
value $n_c$ such that the answer depends only on whether
$n < n_c$, so only one bit of information is required.
However, extracting that bit via sequential queries requires
storing n exactly at every step (using O(log N) bits of
memory). Any loss of precision could cause a ±1 error at the
final step, and thus a wrong decision. In this example,
classical storage is sufficient. But in the general case,
where the candidate $\rho_k$ do not commute, no method is known
to compress the intermediate data into classical memory
without loss (previous work suggests it is probably
impossible).
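The classical example can be written out explicitly. The sketch below assumes two Bernoulli distributions with probabilities p0 < p1 of producing "1" and equal priors, so the threshold follows from a likelihood-ratio test; `threshold` and `decide` are illustrative names.

```python
import math

def threshold(n_samples, p0, p1):
    """Smallest count of "1"s for which hypothesis p1 is at least as likely
    as p0 (equal priors, p0 < p1)."""
    llr_one = math.log(p1 / p0)               # evidence from observing "1"
    llr_zero = math.log((1 - p1) / (1 - p0))  # evidence from observing "0"
    return math.ceil(-n_samples * llr_zero / (llr_one - llr_zero))

def decide(samples, p0, p1):
    """Stream through the samples, keeping only the running count n."""
    n = 0                                     # the only state kept in memory
    for bit in samples:
        n += bit
    return int(n >= threshold(len(samples), p0, p1))

N = 100
n_c = threshold(N, 0.3, 0.6)
bits_needed = N.bit_length()                  # enough to store any n in 0..N
print(n_c, bits_needed)
```

The final answer is a single bit, but the counter must be exact: a sequence with n_c - 1 ones and one with n_c ones lead to opposite decisions.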

IV. DISCUSSION: APPLICATIONS, 
IMPLEMENTATIONS, AND IMPLICATIONS 

Quantum information science is rife with gaps between what is
theoretically achievable and what is practically achievable.
Our algorithm eliminates performance gaps for pure state
discrimination with local measurements - but it requires a new
kind of measurement apparatus with at least 1 controllable
qubit of quantum storage. Its utility depends on its
applications, and on the difficulty of implementation.

APPLICATIONS: One immediate application of our protocol is
detection of weak forces and transient effects. A simple force
detector (e.g., for magnetic fields) might comprise a large
array of identical systems (e.g., $|\uparrow\rangle$ spins).
Each system is only weakly perturbed by the force, so
information about the force is distributed across the entire
array. Our algorithm efficiently gathers up that information
with no loss - whereas local measurements with classical
processing waste much information.

A more sensitive N-particle "antenna" would incorporate
entanglement between the N particles [20]. High sensitivity
can be achieved by simple MPS states with D = 2, like NOON
states,

$\frac{|N,0\rangle + |0,N\rangle}{\sqrt{2}} \equiv \frac{|\uparrow\rangle^{\otimes N} + |\downarrow\rangle^{\otimes N}}{\sqrt{2}}$.

Collective forces do not change D, so the final states to be
discriminated are also MPS. Our approach can discriminate such
states and it can be used to prepare them, by running the
"rolling up" process in reverse [22].
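That such entangled "antenna" states have bond dimension 2, before and after a collective perturbation, can be verified directly. The sketch below writes the spin form of the state as an explicit D = 2 MPS and models the collective force as an assumed local phase $e^{i\phi}$ on each $|1\rangle$; `ghz_mps` and `mps_to_vector` are illustrative names.

```python
import numpy as np

def ghz_mps(n, phi=0.0):
    """Bond-dimension-2 MPS tensors for (|0..0> + e^{i n phi}|1..1>)/sqrt(2),
    for n >= 2 sites. Tensor indices are (left bond, site, right bond)."""
    core = np.zeros((2, 2, 2), dtype=complex)
    core[0, 0, 0] = 1.0
    core[1, 1, 1] = np.exp(1j * phi)                       # local phase on |1>
    first = core.sum(axis=0, keepdims=True) / np.sqrt(2)   # shape (1, 2, 2)
    last = core.sum(axis=2, keepdims=True)                 # shape (2, 2, 1)
    return [first] + [core] * (n - 2) + [last]

def mps_to_vector(tensors):
    """Contract a list of MPS tensors into a dense state vector."""
    psi = tensors[0]
    for t in tensors[1:]:
        psi = np.tensordot(psi, t, axes=([-1], [0]))
    return psi.reshape(-1)

n, phi = 6, 0.1
v0 = mps_to_vector(ghz_mps(n))
v1 = mps_to_vector(ghz_mps(n, phi))

# Schmidt rank across the middle cut stays 2 after the collective phase.
s = np.linalg.svd(v1.reshape(2 ** (n // 2), -1), compute_uv=False)
rank = int(np.sum(s > 1e-12))

overlap = abs(np.vdot(v0, v1))      # equals |cos(n * phi / 2)|
print(rank, overlap)
```

The overlap between the unperturbed and perturbed states depends on $n\phi$ rather than $\phi$, which is the sensitivity gain from entanglement, and both states remain within reach of a protocol sized for D = 2.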

More ambitious applications include direct probing of
many-body states, to test a particular MPS ansatz for a lattice
system, or to characterize results of quantum simulations in
optical lattices or ion traps. Without fully scalable quantum
computers that can couple directly to many-body systems,
coherent measurements may be the only way to efficiently probe
complex N-particle states. Our protocol does not obviously
scale to PEPS, the higher-dimensional analogues of MPS [14].
Like MPS, PEPS obey an area law - entanglement across a cut
scales not with the volume of the lattice (N), but with the
area of the cut. For a 1-dimensional MPS on N systems, any cut
has area 1, so the Schmidt rank scales as O(1), and our
algorithm requires an O(1) qubit QIP. Rolling up a general
PEPS on an n-dimensional lattice would require
$O(N^{(n-1)/n})$ bits of quantum memory. However, some PEPS
can be sequentially generated [23], and are likely amenable to
our protocol.

IMPLEMENTATIONS: Engineering requirements for a coherent
measurement apparatus are achievable with near-future
technology. A should be a clean K-dimensional quantum system
with:

1. Universal local control,

2. Long coherence time relative to the gate timescale,

3. Controllable interaction with an external d-dimensional
"sample" system,

4. Sequential coupling to each of N samples,

5. Strong measurements (which may be destructive).

K = 2 is sufficient for proof-of-principle, but $K \ge 3$
would be more exciting because adaptive local measurements can
already discriminate K = 2 states optimally.

These requirements are much weaker than those for scalable
quantum computing. Coherent measurement could be an early
application for embryonic quantum architectures. Furthermore,
scalability is not required, just a single K-dimensional
system. Only local control has to be universal, since the
interaction with external systems is limited. Error correction
is not mandatory, for coherence need only persist long enough
to interact with each of the N systems of interest. Since
measurements are postponed until the end, they can be
destructive.

We do require A to be portable - i.e., sequentially coupled to
each of the N samples - whereas a quantum computer can be built
using only nearest neighbor interactions. Fortunately, most
proposed architectures have
selective coupling either through frequency space (NMR, 
ion traps with a phonon bus) or physical motion of the 
qubits (some ion traps) or flying qubits (photonic archi- 
tectures). Devices that are not viable for full-scale quan- 
tum computing may be even better for coherent measure- 
ment. For example, an STM might pick up and transport 
a single coherent atomic spin along an array of sample 
atoms, interacting sequentially with each of them. 

IMPLICATIONS: Coherent measurements are a genuinely new way to
gather information. We have not just removed collapse from
standard quantum measurements! That kind of coherent
measurement is used already in quantum error correction, where
it's common to replace a measurement of X with a controlled
unitary of the form

$U = \sum_x |x\rangle\langle x|_S \otimes U_x^{(A)}$. (7)

Such unitaries transfer information about a specific
observable X from S to A. For appropriate $|\psi_0\rangle_A$
and $\{U_x\}$, later measurements of A produce exactly the same
result as if S had been measured directly. The coherent
measurements in our discrimination protocols are not of this
form. They do not measure (i.e., transfer information to A
about) a specific basis. For example, in N-copy state
discrimination, A interacts with the first sample by a SWAP
operation, which has no preferred basis. Later interactions
are also not of controlled-U form (Eq. 7).
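The difference between the two kinds of interaction shows up already for one system qubit and one apparatus qubit. The sketch below assumes a CNOT as the controlled unitary (so the measured observable is the computational basis) and an arbitrary illustrative system state; `apparatus_state` is a hypothetical helper that returns the reduced state of A.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)
P = [np.diag([1, 0]).astype(complex), np.diag([0, 1]).astype(complex)]

# Controlled unitary sum_x |x><x|_S (x) U_x with U_0 = 1, U_1 = X (a CNOT).
U_ctrl = np.kron(P[0], I2) + np.kron(P[1], X)
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)

psi = np.array([0.6, 0.8j])                # arbitrary system state
joint = np.kron(psi, ket0)                 # ordering: S (x) A, A starts in |0>

def apparatus_state(vec):
    """Reduced density matrix of A for a pure state of S (x) A."""
    m = vec.reshape(2, 2)                  # m[s, a]
    return m.T @ m.conj()

rho_ctrl = apparatus_state(U_ctrl @ joint)
rho_swap = apparatus_state(SWAP @ joint)

print(np.round(rho_ctrl, 3))
print(np.round(rho_swap, 3))
```

After the controlled unitary, A holds only the outcome statistics of one basis (a diagonal state); after the SWAP, A holds the full pure state, coherences included, so no basis has been singled out.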

One might ask where the "measurement" occurs, since the
interaction between S and A is purely unitary. The essence of
measurement is that an observer or apparatus gains
information. Quantum measurements are usually construed as
mysterious processes that consume quantum states and excrete
specific, definite measurement outcomes. Quantum theories of
measurements usually represent them as (i) unitary
interaction, (ii) decoherence and superselection, and finally
(iii) wavefunction collapse or splitting of the universe [28].
Our results suggest that unitary interaction (the only part of
this sequence that is really understood) can stand alone as an
information-gathering "measurement." And by avoiding
decoherence, we can gather information strictly more
effectively.

Decoherence is ubiquitous in human experience. But in its
absence, there is no compelling reason why gathering
information must be accompanied by collapse or definite
outcomes. The whole point of quantum information science is to
produce devices that do not decohere, and that can process
information more efficiently than classical computers. The
central message of this paper is that they can also gather
information more efficiently. Unfettered by decoherence, they
may still be constrained by locality. For such devices,
coherent measurements are the natural way to gather
information.